ABSTRACT

We present a clustering analysis of near ultraviolet (NUV) - optical color selected luminosity 
bin samples of green valley galaxies. These galaxy samples are constructed by matching the 
Sloan Digital Sky Survey Data Release 7 with the latest Galaxy Evolution Explorer source cat- 
alog which provides NUV photometry. We present cross-correlation function measurements 
and determine the halo occupation distribution of green valley galaxies using a new multiple 
tracer analysis technique. 

We extend the halo-occupation formalism, which describes the relation between galaxies 
and halo mass in terms of the probability P(N|M) that a halo of given mass contains
N galaxies, to model the cross-correlation function between a galaxy sample of interest and 
multiple tracer populations simultaneously. This method can be applied to commonly used 
luminosity threshold samples as well as to color and luminosity bin selected galaxy samples, 
and improves the accuracy of clustering analyses for sparse galaxy populations. 

We confirm the previously observed trend that red galaxies reside in more massive halos 
and are more likely to be satellite galaxies than average galaxies of similar luminosity. While 
the change in central galaxy host mass as a function of color is only weakly constrained, 
the satellite fraction and characteristic halo masses of green satellite galaxies are found to be 
intermediate between those of blue and red satellite galaxies. 

1 INTRODUCTION

Most nearby galaxies fall into one of two well-known and well- 
characterized categories. They are either passively evolving ellipti- 
cal galaxies with old stellar populations, red in color and typically 
living in high-density regions, or they are actively star-forming spi- 
ral galaxies with blue color. The latter often are field galaxies or 
reside in other low-density regions like cluster outskirts. 

This blue/red galaxy color bimodality has been observed to
be in place already around z ~ 1. The fraction of red galaxies
increases with time (e.g., Faber et al. 2007), and therefore galaxies
must transition from blue to red. Galaxies in this transitional stage
characteristically show low levels of recent star formation. As
ultraviolet emission is a sensitive tracer of recent star formation, these
transition galaxies are easily identified in a (NUV - r)-M_r
color-magnitude diagram, where they populate a "green valley" between
well-localized red and blue sequences (Wyder et al. 2007).

The relation between galaxy color and environment density 
also evolves with redshift, such that the fraction of red galaxies in- 
creases with time in dense environments but stays nearly constant 
for field galaxies (e.g., Cooper et al. 2007, and references therein). 
This indicates the transition from blue to red galaxies may be driven 
by environmental processes, associated with the infall of a galaxy 
into a larger halo ("cluster"). Proposed mechanisms broadly fall
into one of the following categories: galaxy-galaxy interactions,
such as galaxy mergers, merger-driven nuclear activity and
high-speed galaxy interactions; galaxy-intracluster medium interactions
(e.g., ram pressure stripping or thermal evaporation); and
interactions between an infalling galaxy and the cluster potential (e.g.,
truncation through tidal forces). Observationally these are
disentangled through their characteristic timescales, the dependence of
their respective efficiencies on halo mass, and position within the
cluster (Treu et al. 2003; Cooper et al. 2006; Moran et al. 2007);
for example, galaxy mergers are expected to be one of the
dominant processes in group-scale halos and in the outskirts of massive
clusters.

In the framework of Λ cold dark matter (ΛCDM) cosmology, the
evolution and spatial distribution of dark matter halos is relatively 
well understood. A common technique for inferring the masses of 
halos hosting different galaxy populations is to measure the angu- 
lar or spatial clustering of galaxies and relate it to the predicted 
clustering and abundance of dark matter halos. While the relation 
between galaxy and dark matter clustering on large scales can be 
approximately described by scale-independent biasing, the situa- 
tion is more complicated - and more informative about the physi- 
cal processes at work - on small scales: At the level of individual 
halos, so-called halo-occupation distribution (HOD) models (e.g.,
Berlind and Weinberg 2002) describe the relation between galaxies
and mass in terms of the probability that a halo of given mass
contains N galaxies of a given type. Then galaxy clustering, for
example the two-point correlation function, is modeled as the sum of
contributions from galaxy pairs residing in the same halo and from 
galaxy pairs living in different halos. 

This method of interpreting galaxy correlation functions has
been used extensively. For example, Zehavi et al. (2011; see
references therein for previous/high-z studies) analyze the completed
(DR7) Sloan Digital Sky Survey (SDSS) and find, in agreement
with previous results, that the amplitude of the correlation
function increases with luminosity, and that at fixed luminosity
redder galaxies are more strongly clustered, due to redder galaxies
being satellites in more massive (and thus more biased) halos.
Based on correlation function measurements over the redshift range
0.2 < z < 1.2 from the Canada-France-Hawaii Telescope Legacy
Survey, Coupon et al. (2011) also find red central galaxies to reside
in more massive halos than average central galaxies in the same
luminosity sample.

The clustering of (NUV - r) color selected galaxies from the
Galaxy Evolution Explorer (GALEX) survey has previously been
studied by Heinis et al. (2007), who measure the angular
correlation function; Heinis et al. (2009) and Loh et al. (2010) analyze
spatial clustering as a function of star formation history and color,
respectively. These authors find the clustering of green galaxies to
have intermediate strength compared to blue and red galaxies and
to have a scale dependence closer to that of red galaxies. At small
scales their analysis is strongly limited by statistics due to the small
number density of green valley galaxies, limiting their ability to
constrain the 1-halo term.

We extend the HOD formalism to simultaneously model the
cross-correlation functions (CCFs) of a sparse luminosity bin galaxy
sample with multiple more abundant galaxy populations to study
the environment of local green valley galaxies. We consider
luminosity bin samples of green valley galaxies because the physical
mechanisms populating the green valley, i.e., quenching star
formation in blue galaxies or rejuvenating red galaxies, may depend on
halo mass and thus vary with galaxy luminosity. Compared to an
autocorrelation function based clustering analysis, measuring the
CCF between (sparse) GALEX selected galaxies and more
abundant samples of SDSS galaxies reduces the shot noise
contribution to our measurements, and also increases the effective volume
probed beyond the combined GALEX-SDSS footprint.¹ Extending
previous work on HOD models for CCFs (e.g., Krumpe, Miyaji and
Coil 2010) to simultaneously fit the clustering of the galaxy
sample of interest with respect to multiple tracer populations is
particularly helpful for analyzing the clustering of luminosity bin samples,
which are harder to constrain than the more frequently used
luminosity threshold samples. This allows us to put the separate pieces
of information found by Heinis et al. (2009) and Loh et al. (2010)
into a coherent analysis including HOD modeling, and to improve the
statistics due to the larger survey area included in the newest data
release.

Throughout this analysis we assume a flat ΛCDM cosmology
with Ω_m = 0.3 and σ_8 = 0.8. Unless specified otherwise, all
distances are comoving and quoted in Mpc/h, and all absolute magnitudes
are given in h = 1 units.



2 DATA 
2.1 SDSS 



The Sloan Digital Sky Survey (York et al. 2000) mapped most
of the high-latitude sky in the northern Galactic cap using a
dedicated wide-field 2.5 m telescope at Apache Point Observatory
(Gunn et al. 2006) with the SDSS camera (Gunn et al. 1998).
The raw imaging data were processed by a series of pipelines
performing photometric calibration (Hogg et al. 2001; [...] 2004;
Tucker et al. 2006), photometric reduction ([...] 2001), and
astrometric calibration (Pier et al. 2003). Data release 7
(DR7; Abazajian et al. 2009) of the spectroscopic sample provides
(u'g'r'i'z') photometry (Fukugita et al. 1996; Smith et al. 2002)
and spectra for nearly 900000 galaxies with m_r < 17.77 over 8000
square degrees. These galaxies were selected from the photometric
survey for spectroscopic follow-up using specific algorithms for the
main galaxy sample (Strauss et al. 2002) and luminous red galaxies
(Eisenstein et al. 2001). The main spectroscopic galaxy sample is
nearly complete to r < 17.77 and has a median redshift of z ~ 0.1.
Based on these observations, the NYU Value Added Galaxy
Catalog (VAGC; Blanton et al. 2005) contains galaxy samples which
have been constructed for large-scale structure studies: all
magnitudes are re-calibrated (Padmanabhan et al. 2008) and K-corrected
(Blanton et al. 2003a), and the radial selection function and
angular completeness are carefully determined from the data. We restrict
this sample to m_r < 17.6 to ensure uniform completeness of faint
galaxies across the survey area.

1 We note that the increase in effective volume is limited to those regions of
the SDSS footprint that are closer to the combined GALEX-SDSS footprint
than the largest scales probed by the CCF. Due to the patchy geometry of
the GALEX footprint, these regions cover nearly the entire SDSS footprint.

Due to fiber placement in the SDSS spectrograph (Blanton
et al. 2003b), galaxies closer than 55" cannot be observed on the
same spectroscopic plate, and hence no redshifts have been
measured for about 7% of all targeted galaxies. The lack of observed
close galaxy pairs affects the measured correlation functions on
small scales. While it is possible to correct for fiber collisions down
to 0.01 Mpc/h (Li et al. 2006), the number density of green valley
galaxies is too small to obtain correlation function measurements
at such small separations, and we simply assign each galaxy with a
missing spectrum the redshift of its nearest neighbor. This method has
been shown to work well for projected correlation functions above
the scale corresponding to 55" (Zehavi et al. 2005). For the most
distant galaxies in our sample the fiber collision scale is 0.07
comoving Mpc/h, and we measure correlation functions only on
perpendicular scales r_p > 0.1 Mpc/h.

Spectral line measurements and mass estimates for these
galaxies are taken from the MPA-JHU catalog. We use the former
to classify the (NUV - r) selected transitional galaxies with
emission line diagrams and to compare (NUV - r) color selection with
spectroscopic separation of active and quenched galaxies based on
D_n4000 (Fig. 8). Note that these quantities are estimated from a
fiber size of 3", and due to the low redshift of our galaxy sample these
measurements may not be representative of the luminosity averaged
properties of a galaxy but rather be dominated by central (bulge
dominated) regions.



2.2 GALEX 

NUV photometry for this project is taken from the GALEX
Medium Imaging Survey (MIS) Source Catalog (GMSC, Seibert
et al. in prep.) derived from the GALEX GR6 data release, which
provides unique measurements of point and extended sources up
to 1 arcminute diameter in the GALEX bands (Seibert et al., in
prep.). The NUV source catalog covers 4827 square degrees at
λ_eff = 2316 Å with a resolution of 5.3", reaching a depth of
23 mag.

GALEX has a circular field of view of 1.2°, which is
sampled at 1.5". Each field targets a pre-defined position on the sky,
resulting in a hexagonal tiling of the survey. These angular
selection parameters are contained in exposure time, coverage and flag
maps in HEALpix (Gorski et al. 2005) format accompanying the
GMSC, which we use to define the combined footprint and select
our galaxy sample as detailed in section 2.3.



Table 1. Cross-match sample definition^a

Parameter                        Limits
r-band magnitude                 14.1 < r < 17.1
redshift                         0.02 < z < 0.2
GALEX field radius               fov-radius < 0°.55
GALEX exposure time              t > 1000 s
NUV flag                         nuv-artifact < 1
NUV magnitude                    16.0 < NUV < 23.0
SDSS/NUV angular completeness    f_comp > 0.7

^a The parent catalog is the NYU VAGC dr72bright.



2.3 SDSS-MIS Cross-Match 

In order to match the VAGC with NUV detections, we first
construct the combined footprint of these two surveys. This is done by
converting the VAGC angular selection function, which is given in
terms of Mangle polygons (Hamilton and Tegmark 2004), into the
pixelized HEALpix format (Swanson et al. 2008). Then we multiply
the angular selection functions of the VAGC and MIS in each pixel
(at resolution N_side = 2048) and restrict the overlap region to pixels
where the angular completeness fraction of both surveys is larger
than 0.7. This results in a combined survey with an effective area
of 2708 square degrees. Furthermore, we require tiles to have NUV
exposure times t > 1000 s, which reduces the combined effective
area to 1945 square degrees. This final overlap region is shown in
black in Fig. 1.
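
The footprint construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the three maps are taken to be plain Python lists over the same HEALPix pixel ordering, the function names are hypothetical, and the "effective area" is assumed here to be the completeness-weighted pixel area, which the text does not spell out. At N_side = 2048 a pixel covers about 8.2e-4 square degrees, i.e. roughly 1.7 arcminutes on a side.

```python
import math


def healpix_pixel_area_deg2(nside):
    """Area of one HEALPix pixel in square degrees (12 * nside**2 equal-area pixels)."""
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    return full_sky_deg2 / (12 * nside ** 2)


def combined_footprint(sdss_comp, galex_comp, exptime, nside,
                       min_comp=0.7, min_exptime=1000.0):
    """Return (mask, effective_area_deg2) for the overlap of two surveys.

    A pixel is kept if both completeness fractions exceed min_comp and the
    NUV exposure time exceeds min_exptime. The effective area is ASSUMED to
    be the sum of (product of completeness fractions) * pixel area.
    """
    pix_area = healpix_pixel_area_deg2(nside)
    mask = []
    eff_area = 0.0
    for c_sdss, c_galex, t_exp in zip(sdss_comp, galex_comp, exptime):
        keep = c_sdss > min_comp and c_galex > min_comp and t_exp > min_exptime
        mask.append(keep)
        if keep:
            eff_area += c_sdss * c_galex * pix_area
    return mask, eff_area
```

Setting min_exptime to zero in such a sketch corresponds to the first stage of the selection (completeness cut only); the exposure-time cut is then the second stage.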

We cross-match all galaxies in the VAGC within this overlap
area with NUV detections using a search radius of 4". In order to
construct a complete statistical sample, we then restrict the
cross-match with various cuts summarized in table 1. Due to deblending
and centering issues for nearby or very bright objects, the NUV and
r-band photometry pipelines may report positions for these objects
that are farther separated than the matching radius, leading to
spurious non-detections. Furthermore, the astrometric and
photometric precision of the GALEX detections declines toward the edges
of each tile and near light echoes and other imaging artifacts, and
we exclude these regions as detailed in table 1. The color-apparent
magnitude distribution and completeness of the final cross-match
sample are shown in Fig. 2. For apparently bright galaxies (m_r < 16)
the blue sequence (around (NUV - r) = 2 - 3) and the red sequence
(around (NUV - r) ≈ 5 - 6) are clearly visible. No galaxies are
found with (NUV - r) > 6.5, though these should be well within
the GALEX detection limit (indicated by the inclined line) at these
magnitudes if they existed. For these bright galaxies far from the
NUV detection limit the cross-match completeness is around 90%;
it decreases for fainter objects as the NUV detection limit moves
into the color-magnitude space occupied by red galaxies. In order
to retain a nearly complete sample of green valley galaxies we cut
the cross-match sample at m_r < 17.1. The resulting cross-match
catalog has a completeness of 76%, i.e., 76% of the galaxies in the
VAGC that meet the magnitude and redshift criteria described
above and lie at a position with GALEX coverage as detailed
in table 1 have a GALEX detection with m_NUV < 23.0.
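
As an illustration of the positional cross-match and of the completeness figure quoted above, the following minimal sketch matches each optical position to the nearest NUV source within a 4" radius. The function names and the brute-force search are assumptions made for readability; a catalog of realistic size would need a spatial index, and this is not presented as the matching code used for the analysis.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin(0.5 * (dec2 - dec1))
    sin_dra = math.sin(0.5 * (ra2 - ra1))
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def cross_match(optical, nuv, radius_arcsec=4.0):
    """For each optical (ra, dec), return the index of the nearest NUV source
    within radius_arcsec, or None if there is no such source."""
    matches = []
    for ra_o, dec_o in optical:
        best, best_sep = None, radius_arcsec
        for j, (ra_n, dec_n) in enumerate(nuv):
            sep = angular_separation_arcsec(ra_o, dec_o, ra_n, dec_n)
            if sep <= best_sep:
                best, best_sep = j, sep
        matches.append(best)
    return matches


def completeness(matches):
    """Fraction of optical objects that received an NUV counterpart."""
    return sum(m is not None for m in matches) / len(matches)
```

In this picture the 76% completeness is simply the matched fraction of VAGC galaxies that pass the cuts of table 1.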

Finally, we use kcorrect v4.2 (Blanton and Roweis 2007) to
calculate absolute NUV_0.1 magnitudes of the cross-match galaxies
k-corrected to z = 0.1. As the redshift evolution in the NUV is
not very well constrained, we do not attempt to apply evolution
corrections to either the NUV or the optical magnitudes. Similarly, we do
not attempt to correct the (NUV - r) colors for intrinsic extinction.
To isolate transitional galaxies and avoid identifying dusty
(edge-on) spiral galaxies as green valley objects, we only consider objects
with r-band isophotal axis ratio b/a > 0.5.

Figure 1. Combined SDSS + GALEX MIS footprint. The area covered by
the VAGC at an angular completeness f_comp > 0.7 is shown in red; the final
overlap area of 1945 square degrees between VAGC and MIS, as detailed
in section 2.3, is shown in black.



3 SAMPLE DEFINITION 

In order to work with well-defined galaxy populations, we con- 
struct a number of volume-limited samples. As the properties of 
green valley galaxies may vary with luminosity, we define sam- 
ples of width 0.5 in absolute magnitude, and find the redshift range 
over which all galaxies in this sample have apparent magnitudes 

Figure 2. Completeness of the cross-match sample. Left: Apparent
magnitude-(NUV - r) color diagram. Black dots show a random subset
of VAGC galaxies with NUV cross-match. Red dots indicate VAGC
galaxies without NUV detections, which have been placed at the detection limit
NUV = 23 and corrected for position dependent Galactic extinction.
Right: Completeness of the NUV cross-match as a function of apparent
magnitude.

Table 2. Volume-limited galaxy samples
(Table body not recovered; column groups: Green Valley sample, SDSS samples.)



Figure 3. Volume-limited color selected galaxy samples: Black dots show
a random subsample of VAGC galaxies with m_r < 17.6, subsampled by a
factor 10. Green symbols indicate green valley galaxies identified based on
their (NUV - r) color, which are restricted to 14.1 < m_r < 17.1 to
ensure (near) completeness of the cross-matched sample. Red boxes indicate
the location of volume-limited color selected galaxy samples in
luminosity-redshift space.

Figure 4. Definition of volume-limited SDSS galaxy reference samples:
Light-gray dots show a random subsample of VAGC galaxies with m_r <
17.6, subsampled by a factor 10. The red box indicates the location of
the [-20.5, -21] magnitude range volume-limited color selected galaxy
sample. The dark gray points show the extent of the volume-limited "faint"
SDSS galaxy reference sample associated with this color selected galaxy
sample; the black dots illustrate the associated "bright" luminosity
threshold reference sample. The definitions are analogous for other magnitude
ranges, hence we show only one example to improve clarity of the plot.



14.1 < m_r < 17.1 (the magnitude range of the cross-matched
catalog), cf. Fig. 3. The VAGC has less stringent apparent
magnitude requirements (10 < m_r < 17.6), and we define two samples
of SDSS galaxies occupying the same volume as each
luminosity bin sample of NUV detected objects, which are used for the
cross-correlation analysis. These samples are described in detail in
table 2. Specifically, for the luminosity bin [M_r,min, M_r,max] we
define the "bright" sample of SDSS galaxies to contain all galaxies in
the same redshift range brighter than M_r,max, and the "faint" sample
to consist of the volume-limited sample [M_r,min + 0.5, M_r,max]. The
definition of these samples in luminosity-redshift space is illustrated
in Fig. 4. We refer to the union of these two samples, which is a
luminosity threshold sample with threshold M_r,min + 0.5, as the SDSS
"all" sample.

Note: The first two columns give the magnitude range
[M_r,min, M_r,max] and mean redshift of the green valley galaxy
samples illustrated in Fig. 3. N_G is the number of green valley
galaxies in this sample, and n_G their mean comoving density per
10^-3 (Mpc/h)^3. N_f and N_b are the number of SDSS galaxies in
the faint and bright sample in the same volume; the bright
sample consists of galaxies in the same volume that are brighter than
M_r,max, and the faint sample contains galaxies in the magnitude
range [M_r,min + 0.5, M_r,max].
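
The sample definitions of this section can be made concrete with a short sketch. It computes the redshift range over which every galaxy of a luminosity bin stays within 14.1 < m_r < 17.1, and labels SDSS galaxies as members of the "bright" or "faint" reference sample. Two simplifications are assumptions of this sketch and not of the analysis: K-corrections are ignored, and the function names are hypothetical. The cosmology is the flat ΛCDM model with Ω_m = 0.3 adopted in the text, with magnitudes in h = 1 units.

```python
import math

OMEGA_M = 0.3               # flat LCDM, as adopted in the text
HUBBLE_DIST = 2997.92458    # c / H0 in Mpc/h


def comoving_distance(z, steps=512):
    """Line-of-sight comoving distance in Mpc/h (composite Simpson rule)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + 1.0 - OMEGA_M)
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return HUBBLE_DIST * total * h / 3.0


def distance_modulus(z):
    """Distance modulus for h = 1, using D_L = (1 + z) * D_C in a flat universe."""
    return 5.0 * math.log10((1.0 + z) * comoving_distance(z)) + 25.0


def redshift_at_modulus(dm, z_lo=1e-4, z_hi=1.0):
    """Invert distance_modulus by bisection (it increases monotonically with z)."""
    for _ in range(60):
        z_mid = 0.5 * (z_lo + z_hi)
        if distance_modulus(z_mid) < dm:
            z_lo = z_mid
        else:
            z_hi = z_mid
    return 0.5 * (z_lo + z_hi)


def volume_limited_range(m_r_min, m_r_max, app_bright=14.1, app_faint=17.1):
    """Redshift range for the bin [m_r_min, m_r_max] (faint edge, bright edge).

    z_min: the brightest galaxy of the bin is just fainter than app_bright.
    z_max: the faintest galaxy of the bin is just brighter than app_faint.
    K-corrections are neglected (assumption of this sketch).
    """
    return (redshift_at_modulus(app_bright - m_r_max),
            redshift_at_modulus(app_faint - m_r_min))


def reference_sample(abs_mag, m_r_min, m_r_max):
    """Label an SDSS galaxy in the bin's volume as 'bright', 'faint', or None."""
    if abs_mag < m_r_max:
        return "bright"
    if abs_mag <= m_r_min + 0.5:
        return "faint"
    return None
```

For the bin [-20.5, -21] this gives a range of roughly 0.03 < z < 0.10 under the stated simplifications; the union of the two labels is the luminosity threshold "all" sample.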

3.1 Finding the Green Valley 

We define the location of the green valley in (NUV - r)
color-magnitude space by fitting blue and red sequences to the color
distribution of each volume-limited sample. We include galaxies
without NUV detections, which otherwise meet all cross-match criteria
and are optically red ((g - r) > 0.8), by placing them at the NUV
detection threshold, correcting for position dependent Galactic
extinction and assigning the mean k-correction of cross-matched galaxies
which are within Δ(NUV - r) = ±0.1 mag, ΔM_r = ±0.1 mag, and
Δz = ±0.02 of the unmatched galaxy. We then find the center and
scatter of the color sequences by fitting each sequence with a
Gaussian. Initially, we cut the distribution at (NUV - r) = 4.2 and fit a
Gaussian to each side. We then iteratively adjust the fitting range
to include the galaxies within 1σ of the peak location on the ridge
toward the green valley. The best-fit parameters for each sample
are shown in Fig. 6 along with fits to the blue and red sequence
obtained by Wyder et al. (2007), which are based on a different fitting
scheme and one continuous galaxy sample weighted by the V_max
method instead of using disjoint volume-limited samples. As we
include NUV non-detections, which are unaccounted for by Wyder
et al. (2007), our red sequence is slightly redder for faint galaxies,
but otherwise these results agree very well.
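
One possible reading of the iterative sequence fit is sketched below. The text does not specify the fitting algorithm, so the choices here are assumptions: the Gaussian is fitted to a color histogram by a weighted least-squares parabola in log(counts), and the fitting range for each sequence extends from its outer side up to 1σ past the peak toward the green valley. All function names are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_gaussian(centers, counts):
    """Fit log(counts) = a + b*u + c*u**2 with u = x - x0, weighted by counts.

    Returns (mu, sigma). Empty bins are skipped.
    """
    pts = [(x, math.log(n), n) for x, n in zip(centers, counts) if n > 0]
    w_sum = sum(w for _, _, w in pts)
    x0 = sum(w * x for x, _, w in pts) / w_sum      # centering improves conditioning
    s = [sum(w * (x - x0) ** k for x, _, w in pts) for k in range(5)]
    t = [sum(w * y * (x - x0) ** k for x, y, w in pts) for k in range(3)]
    a_mat = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]

    def with_column(col):
        return [[t[i] if j == col else a_mat[i][j] for j in range(3)]
                for i in range(3)]

    det = _det3(a_mat)
    b = _det3(with_column(1)) / det                 # Cramer's rule
    c = _det3(with_column(2)) / det
    if c >= 0.0:
        raise ValueError("histogram is not peaked within the fitting range")
    return x0 - b / (2.0 * c), math.sqrt(-1.0 / (2.0 * c))


def fit_sequence(centers, counts, side, cut=4.2, n_iter=20):
    """Iteratively fit the 'blue' (below the cut) or 'red' (above the cut) sequence.

    After each fit the range limit moves to mu + sigma (blue) or mu - sigma (red).
    """
    sign = 1.0 if side == "blue" else -1.0
    limit = cut
    for _ in range(n_iter):
        sel = [(x, n) for x, n in zip(centers, counts) if sign * (x - limit) < 0.0]
        mu, sigma = fit_gaussian([x for x, _ in sel], [n for _, n in sel])
        limit = mu + sign * sigma
    return mu, sigma
```

With noisy histograms the bin width and the weighting both influence the result, so this should be read as a sketch of the iteration logic and not as a statement of the fitting scheme behind Fig. 5 and Fig. 6.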

The black error bars in Fig. 5 illustrate the mean photometric
uncertainty in the (NUV - r) color of blue/red galaxies, suggesting 
that asymmetric scatter into the green valley due to photometric 
uncertainties is small compared to the intrinsic scatter of the red 
sequence. 



3.2 Sample Properties 

In order to facilitate the comparison with other studies of tran- 
sitional galaxies based on optical criteria, we characterize the 
(NUV - r) selected galaxies in other parameter spaces. 

Figure 7 and Fig. 8 show the distribution of (NUV - r)
selected galaxies in (g - r) color space and as a function of the Balmer
break index D_n4000. Here the red sample again includes NUV
non-detections as described in section 3.1. The vertical lines indicate
the transition between blue/red and star forming/quenched galaxies
based on (g - r) and D_n4000, respectively. Most faint (NUV - r)
selected green valley galaxies are optically blue and would be
classified as star forming by both of these criteria. On the other end,
a large fraction of luminous, (NUV - r) selected transitional
galaxies would be classified as red/quenched by both of these criteria.

Figure 5. Comoving density of the volume-limited galaxy samples as a function of (NUV - r) color. Solid histograms show all NUV detected galaxies.
The dotted histograms include NUV non-detections, which otherwise meet all cross-match criteria and are optically red ((g - r) > 0.8), placed at the NUV
detection threshold, corrected for position dependent Galactic extinction and assigned the mean k-correction of cross-matched galaxies which are within
Δ(NUV - r) = ±0.1 mag, ΔM_r = ±0.1 mag, and Δz = ±0.02 of the unmatched galaxy. The solid line shows the double Gaussian fit to the blue side of the blue
sequence and the red side of the red sequence, as described in section 3.1, and the vertical blue and red lines show the 1σ ridge of the color sequences derived from
these fits. The colored error bars also indicate the 1σ scatter of the color sequences centered on their respective peak. The black error bars illustrate the mean
photometric uncertainty in the (NUV - r) color of blue/red galaxies.

Figure 6. Defining the green valley: Symbols and error bars show the
location and scatter of the blue and red sequence from the fits in Fig. 5. Lines
show the best-fit sequences from Wyder et al. (2007) transformed to our
magnitude units.

Figure 7. Colored histograms show the distribution of (NUV - r) selected
blue/green/red galaxies in (g - r) space. The black histogram shows the
distribution of all SDSS galaxies in the volume-limited sample, but not
restricted to the combined footprint. The vertical line shows the color cut
separating blue and red galaxies from Zehavi et al. (2011).

Figure 8. Same as Fig. 7 but for D_n4000. The vertical line shows the
separation between quenched (D_n4000 > 1.6) and star forming galaxies used in
Tinker, Wetzel and Conroy (2011).

Furthermore, Fig. 9 shows the distribution of stellar masses as
a function of (NUV - r) color. The stellar masses are taken from the
MPA-JHU catalog and are based on Kauffmann et al. (2003b). At
fixed luminosity, green valley galaxies and red sequence galaxies
have similar stellar masses.

We illustrate the distribution of green valley galaxy spectra for
different luminosity bins in Fig. 10. The thick line shows the mean
spectrum obtained from stacking all green valley galaxies (with
r-band isophotal axis ratio larger than 0.5) within Δz = 0.02 of the
mean redshift of each luminosity bin. The individual spectra are
normalized to the median flux in the 410-500 nm range, giving each
galaxy equal weight. The thin gray lines show smoothed
individual spectra of 25 galaxies randomly chosen from those used in the
stacking process. While we use the spectral mask to exclude pixels
flagged by the SDSS spectral reduction pipeline, these spectra
contain residual atmospheric [OI] and OH. Note that the fiber diameter
of 3" roughly corresponds to 1.5 kpc/h at z = 0.036, to 3 kpc/h at
z = 0.083, and to 4.8 kpc/h at z = 0.13. The stacked spectra show
that, on average, green valley galaxies have red bulges and some
amount of AGN activity. All spectra show Hα, or a combination of
Hα and [NII], emission, which we classify further using emission
line diagnostics in Tab. 3. For green valley galaxies with emission
line measurements with S/N > 3 the AGN fraction is substantial,
especially among the more luminous ones. Note that we use
emission lines from the MPA-JHU catalog with rescaled flux errors. However,
in particular for the less luminous samples at lower redshifts there is
considerable spread among objects, limiting the informative value
of the stacked spectra.
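
The fiber scales quoted above can be checked with a short calculation. The sketch below assumes that the quoted numbers are physical (not comoving) transverse sizes, computed from the angular diameter distance in the flat ΛCDM cosmology with Ω_m = 0.3 adopted in the text; the function names are hypothetical.

```python
import math

OMEGA_M = 0.3               # flat LCDM, as adopted in the text
HUBBLE_DIST = 2997.92458    # c / H0 in Mpc/h


def comoving_distance(z, steps=512):
    """Line-of-sight comoving distance in Mpc/h (composite Simpson rule)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + 1.0 - OMEGA_M)
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return HUBBLE_DIST * total * h / 3.0


def fiber_scale_kpc(z, fiber_arcsec=3.0):
    """Physical transverse size in kpc/h subtended by the fiber diameter at redshift z."""
    d_ang = comoving_distance(z) / (1.0 + z)   # angular diameter distance, flat universe
    theta = math.radians(fiber_arcsec / 3600.0)
    return 1000.0 * d_ang * theta
```

Under these assumptions the three redshifts give values close to the rounded numbers in the text, which illustrates how strongly the fiber is restricted to the central regions of the nearest galaxies.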

Table 3. Classification of Green Valley Galaxies

a Fraction of green valley galaxies with Hα emission detected at
S/N > 3
b Fraction of green valley galaxies classified as star forming (f_SF),
composite (f_comp), or AGN (f_AGN) based on the [NII]/Hα vs.
[OIII]/Hβ Baldwin, Phillips and Terlevich (1981) emission line
diagram, using the Kewley et al. (2001) extreme starburst classification
line and the Kauffmann et al. (2003a) pure star formation line
d Fraction of galaxies with low signal-to-noise (S/N < 3) in at
least one of these emission lines, not included in the emission line
classification

Figure 9. Same as Fig. 7 but for stellar mass.




Figure 10. Stacked spectra of (NUV - r) selected green valley galaxies for
different luminosity bins as a function of rest-frame wavelength. The thin
gray lines show 25 randomly chosen individual spectra, boxcar smoothed
over 10 pixels to enhance readability.



4 CLUSTERING ANALYSIS 

4.1 Projected Correlation Functions 

To separate spatial clustering from redshift space distortions, we
first measure the correlation functions in the radial direction π and the
perpendicular direction r_p and then project out redshift space
distortions. Specifically, we measure the (cross-)correlation function of
galaxy samples D_x,y using the Landy and Szalay (1993) estimator
and its generalization for cross-correlation functions (Szapudi and
Szalay 1998),

\xi_{xy}(r_p, \pi) = \frac{D_x D_y - D_x R_y - D_y R_x + R_x R_y}{R_x R_y}(r_p, \pi), \qquad (1)

on a two-dimensional grid. Here R_x,y are the associated random
catalogs, and DD(r_p, π), DR(r_p, π) and RR(r_p, π) are the (normalized)
numbers of data-data, data-random, and random-random pairs at
separation (r_p, π). We adopt linear binning in the radial
component and logarithmic bins in perpendicular distance, and measure the
projected (cross-)correlation function as

w_{xy}(r_p) = 2 \int_0^{\pi_{\max}} d\pi \, \xi_{xy}(r_p, \pi), \qquad (2)

with π_max = 50 Mpc/h.
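
Equations (1) and (2) translate directly into code once the normalized pair counts are available. The sketch below assumes the four pair-count grids are nested lists indexed as [r_p bin][π bin], with linear π bins spanning 0 to π_max; the pair counting itself is not shown, and the function names are hypothetical.

```python
def landy_szalay_cross(dxdy, dxry, dyrx, rxry):
    """Eq. (1): xi = (DxDy - DxRy - DyRx + RxRy) / RxRy for normalized pair counts."""
    if rxry <= 0.0:
        raise ValueError("random-random pair count must be positive")
    return (dxdy - dxry - dyrx + rxry) / rxry


def projected_correlation(xi_of_pi, pi_max=50.0):
    """Eq. (2) as a Riemann sum: w = 2 * sum(xi) * dpi over linear bins in [0, pi_max]."""
    dpi = pi_max / len(xi_of_pi)
    return 2.0 * dpi * sum(xi_of_pi)


def wp_from_counts(dxdy, dxry, dyrx, rxry, pi_max=50.0):
    """Projected correlation function per r_p bin from four [r_p][pi] count grids."""
    wp = []
    for row in zip(dxdy, dxry, dyrx, rxry):
        # row holds one r_p bin from each grid; zip(*row) walks the pi bins
        xi_row = [landy_szalay_cross(*cell) for cell in zip(*row)]
        wp.append(projected_correlation(xi_row, pi_max))
    return wp
```

For an autocorrelation the same functions apply with D_x = D_y and R_x = R_y, in which case the two cross terms coincide.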



4.2 Measurements 

We generate random catalogs with the SDSS angular selec- 
tion function and the angular selection function of the GALEX- 
SDSS cross-match catalog. As we have constructed volume-limited 
galaxy samples, and their color selected subsamples, with narrow 
redshift ranges allowing us to ignore redshift evolution effects, the 
random catalogs have uniform comoving density and do not need 
to account for the radial selection function. The random catalogs 
are oversampled compared to the galaxy catalogs by a factor 25 for 
SDSS samples, and by a factor 100 for the sparser (NUV - r)
selected samples. Increasing the oversampling rate by a factor of
two has no significant impact, indicating that the correlation func- 
tion estimates have converged. 
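
Because the samples are volume limited, the radial part of a random catalog only has to realize a uniform comoving density. A minimal sketch of that step is given below: distances are drawn with probability proportional to D^2 between the sample's distance limits by inverting the cumulative volume. The angular positions, which must follow the survey's angular selection function, are not shown, and the function name and interface are assumptions.

```python
import random


def random_comoving_distances(n, d_min, d_max, seed=0):
    """Draw n comoving distances with p(D) proportional to D**2 on [d_min, d_max].

    The cumulative volume grows as D**3, so a uniform deviate u maps to
    D = (d_min**3 + u * (d_max**3 - d_min**3)) ** (1/3).
    """
    rng = random.Random(seed)
    lo, hi = d_min ** 3, d_max ** 3
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / 3.0) for _ in range(n)]
```

With an oversampling factor of 25 or 100, n would be 25 or 100 times the number of galaxies in the corresponding sample.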

Figure 11 demonstrates that we have characterized the
combined survey geometry sufficiently well to measure correlation
functions in this patchy survey geometry. Here we show the
correlation function between a galaxy sample in the full SDSS footprint in
the magnitude bin [-19.5, -20] and blue color ((g - r) < 0.8) with
different subsets of itself: The dashed line shows its
autocorrelation function. Next we consider the cross-correlation between this
sample and its restriction to the footprint of the SDSS + GALEX
combined catalog, which is shown by the dotted line. Compared
to the full autocorrelation function, this cross-correlation function
may be affected by boundary effects associated with the
correlation function estimator or finite volume effects, as we have reduced
the volume probed by one copy of the galaxy catalog by a
factor of four. Note that in this case the angular selection function in
the combined survey area is still given by the SDSS angular
selection function. Next we further restrict one copy of the galaxy
catalog to galaxies with NUV detections, shown by the solid line. As
the galaxy sample consists only of blue galaxies, these should all
have NUV detections, and any significant differences between the
dotted and solid line would indicate a mis-characterization of the
combined angular selection function. One copy of the galaxy
catalog stays the same throughout the process, so that we measure the
cross-correlation between samples with different footprints, which
leads to better statistics and smaller finite volume effects than
restricting the SDSS data to the combined footprint region as well.

As described in detail in Zehavi et al. (2011), the clustering of
the faintest SDSS luminosity threshold samples is subject to
substantial sample variance effects due to the small volume probed by
these low-redshift samples. As we are interested in a sparse
subpopulation of these samples and are furthermore restricted to one fourth
of the SDSS footprint area, these sampling effects are even more
severe in our analysis. After reproducing the sub-volume tests of
Zehavi et al. (2011), we find that the magnitude bin [-19.5, -20] is the
faintest sample for which we can obtain robust correlation function
measurements. Examples of measured auto- and cross-correlation
functions for SDSS galaxy samples and green valley galaxies are
shown in Fig. 12. For comparison, we also show measurements of the
green valley galaxy autocorrelation function, for which we used
random catalogs with an oversampling factor of 1000.

We estimate the covariance of our correlation function
measurements using bootstrapping with "oversampling of subvolumes"
(Norberg et al. 2009) with an oversampling factor of 3, where the
number of subvolumes chosen with replacement, $N_r$, is equal to three
times the number of subvolumes the data set is divided up into,
$N_{sub}$. Norberg et al. (2009) find that this method gives robust error
estimates that are in agreement with external estimates from mock
catalogs. For correlation functions between two SDSS galaxy
samples, we divide the SDSS footprint into 150 subsets of equal area.
For correlation functions between one SDSS galaxy sample and
one sample restricted to the combined footprint area, the division
into equal area subsets is not clearly defined, and we choose
subsets which contain equal numbers of random-random pairs at an
angular separation of 2° in order to evenly sample the cross-correlation
function on scales of a few Mpc/h. Due to the smaller effective area
of this restricted geometry, we only have 50 such subareas.
Examples for both types of covariances are shown in Fig. 13. As noted
by Hartlap, Simon and Schneider (2007), the inverse of an estimated
covariance is a biased estimate of the inverse covariance, with the
bias depending on the number of data points, $p$, and the number of
independent data sets, $n$. If the mean is estimated from the data, an
unbiased estimate of the inverse covariance is given by



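$$\widehat{C^{-1}} = \frac{n - p - 2}{n - 1}\,\hat{C}^{-1} , \tag{3}$$

where $\hat{C}$ denotes the covariance estimated from the data.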
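As a small illustration of the correction factor of Hartlap, Simon and Schneider (2007), the following Python sketch rescales an inverse sample covariance by $(n - p - 2)/(n - 1)$. It assumes independent data sets, which does not hold for bootstrap realizations; the function names and the plain list-of-lists matrix format are hypothetical choices made for this example only.

```python
def hartlap_factor(n_sets, n_bins):
    """Return (n - p - 2) / (n - 1) for n independent data sets and p data points."""
    if n_sets <= n_bins + 2:
        # The correction is only meaningful (and positive) for n > p + 2.
        raise ValueError("need n_sets > n_bins + 2")
    return (n_sets - n_bins - 2) / (n_sets - 1)


def debias_inverse_covariance(inv_cov, n_sets, n_bins):
    """Rescale the inverse of an estimated covariance (list of rows)."""
    factor = hartlap_factor(n_sets, n_bins)
    return [[factor * value for value in row] for row in inv_cov]
```

For example, 150 independent subsets and 12 radial bins would give a factor of 136/149, i.e. a correction of roughly 9 per cent.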



As bootstrap realizations are not independent, we cannot apply
Eq. 3 directly with $n = N_r$. Instead, we assume



(4) 



and follow the calibration method described in Eifler, Kilbinger
and Schneider (2008): we measure $\mathrm{tr}(\hat{C}^{-1})$ repeatedly, varying $N_r$
with constant binning and oversampling rate, and determine $m$ as
the slope of $1/\mathrm{tr}(\hat{C}^{-1})$ with $p/N_r$. Specifically, we varied $N_r$
using $N_{sub}$ = (120, 135, 150, 165, 180) for the SDSS footprint, and
$N_{sub}$ = (40, 45, 50, 55, 60) for the GALEX-SDSS footprint.
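The bootstrap with oversampling of subvolumes can be sketched as follows. This is a simplified illustration, not the pipeline used for the measurements: it assumes that each subvolume already provides its own correlation function estimate and that a bootstrap realization is the plain average of the drawn subvolumes, whereas a full analysis would recombine pair counts. The function name, arguments and default values are hypothetical.

```python
import random


def bootstrap_covariance(subvolume_estimates, n_realizations=200,
                         oversampling=3, seed=0):
    """Covariance from bootstrap realizations drawing N_r = oversampling * N_sub subvolumes."""
    rng = random.Random(seed)
    n_sub = len(subvolume_estimates)
    n_bins = len(subvolume_estimates[0])
    n_draw = oversampling * n_sub  # number of subvolumes drawn with replacement
    realizations = []
    for _ in range(n_realizations):
        picks = [rng.randrange(n_sub) for _ in range(n_draw)]
        realizations.append(
            [sum(subvolume_estimates[i][b] for i in picks) / n_draw
             for b in range(n_bins)])
    mean = [sum(r[b] for r in realizations) / n_realizations
            for b in range(n_bins)]
    # Sample covariance between radial bins a and b.
    return [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in realizations)
             / (n_realizations - 1)
             for b in range(n_bins)]
            for a in range(n_bins)]
```

With `oversampling=3` and 150 subvolumes, each realization would draw 450 subvolumes with replacement, matching the setup described for the full SDSS footprint.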

We were unable to obtain stable, invertible covariances for the
most luminous green valley galaxy sample. Hence we restrict our
analysis of this sample to large scales (section 4.3), where it was
possible to measure converged and invertible covariances.



Figure 11. Test of survey geometry effects on measured correlation func- 
tions. Different lines show the projected cross-correlation function between 
galaxies in the full SDSS footprint in the magnitude bin [-19.5, -20] and 
with (g - r) < 0.8 (A) with the same sample, (B) with the sample restricted
to the combined survey area, (C) with GALEX detected galaxies in the same 
magnitude and color bin. 



4.3 Results: Large-Scale Bias

Based on the correlation function measurements described in the
previous section, we can measure the large-scale galaxy bias by
fitting the projected correlation functions with theoretical matter
correlation functions times a linear bias factor. Specifically, we fit
measured correlation functions over the range 3-25 Mpc/h to the
theoretical predictions for the projected matter correlation function,
including the full data covariance. Figure 14 shows the resulting
luminosity bias relation. The top two plots are for binned and
threshold samples of SDSS galaxies, and the lines are fits from the
analysis of galaxy clustering in SDSS DR7 by Zehavi et al. (2011).
Overall, we find good agreement with their results. The $M_r < -20$
galaxy threshold sample and its subsamples deviate from the best-fit
bias relation. As detailed in Tab. 2, these samples are centered
around the redshift of the Sloan Great Wall, which leads to excess
clustering in this and neighboring samples.³ This effect is enhanced
in the lower plots, which show bias as a function of (NUV - r)
color and luminosity or mean stellar mass. Here the clustering of
red galaxies is strongly enhanced in the Sloan Great Wall.

3 This was also noted by Zehavi et al. (2011), who exclude the redshift
range of the Sloan Great Wall from their analysis of luminosity bin galaxy
samples.

Figure 12. Examples of measured cross-correlation functions. For comparison, we also show measurements of the green valley galaxy auto correlation function.

Figure 13. Sample covariances. Top: Covariance between the different auto- and cross-correlation functions of the SDSS faint and bright sample associated
with the magnitude bin [-19.5, -20]. Bottom: Covariance of the cross-correlation function between the [-19.5, -20] green valley sample and the corresponding
SDSS faint and bright samples.
In each block of these covariances perpendicular scales increase from left to right and bottom to top.



5 HALO-OCCUPATION DISTRIBUTION MODELING

At the level of individual halos, a halo-occupation distribution
(HOD) model (e.g., Berlind and Weinberg 2002) describes the
relation between galaxies and halo mass in terms of the probability
$P(N, M_h)$ that a halo of given mass $M_h$ contains $N$ galaxies. To
describe the two-point clustering of galaxies, we need models for the first
and second moment of the HOD, $\langle N | M_h \rangle$ and $\langle N(N - 1) | M_h \rangle$.
Following Zheng et al. (2005), we separate galaxies into central and
satellite galaxies. By definition, a halo contains either zero or one
central galaxy, and it may host satellite galaxies only if it contains a
central galaxy, which motivates the form

$$\langle N(M_h) \rangle = \langle N_c | M_h \rangle \left(1 + \langle N_s | M_h \rangle\right) , \tag{5}$$

with $\langle N_{c/s} | M_h \rangle$ the average number of central/satellite galaxies in a
halo of mass $M_h$.



5.1 HOD Parameterization 

While the assumptions in a HOD model describing the properties of
dark matter halos are generally agreed upon (see section 5.2 for
details), the form of the relation between galaxies and halos (equation
6) is less well constrained and leaves more room for experiments.
We motivate the details of our implementation next.



5.1.1 SDSS Samples 

We base our model for SDSS galaxy samples on the HOD
parameterization of Zehavi et al. (2011) for luminosity threshold samples
with absolute r-band magnitude $M_r < M_r^t$,

$$\langle N(M_h | M_r^t) \rangle = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\log M_h - \log M_{\min}^t}{\sigma_{\log M}^t}\right)\right] \left[1 + \left(\frac{M_h - M_0^t}{M_1^{\prime\,t}}\right)^{\alpha_t}\right] \tag{6}$$

with model parameters $M_{\min}^t$, $M_0^t$, $M_1^{\prime\,t}$, $\sigma_{\log M}^t$, $\alpha_t$. The central galaxy
occupation function is a softened step function with transition mass
scale $M_{\min}^t$, which is the halo mass in which the median central
galaxy luminosity corresponds to the luminosity threshold, and
softening parameter $\sigma_{\log M}^t$, which is related to the scatter between
galaxy luminosity and halo mass. The normalization of the
satellite occupation function, $M_1^{\prime\,t}$, and cut-off scale $M_0^t$ are related to
$M_1^t$, the mass scale at which a halo hosts on average one satellite galaxy
($\langle N_s(M_1^t) \rangle = 1$); finally $\alpha_t$ is the high-mass slope of the satellite
occupation function. This parametrization was found to reproduce
the clustering of SDSS and CFHTLS galaxies (Coupon et al. 2012)
well over a large range of luminosity thresholds and redshifts.
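The threshold-sample occupation function can be evaluated with a few lines of Python. The sketch below assumes the form quoted by Zehavi et al. (2011), a softened step for centrals times one plus a power law for satellites, and additionally assumes that the satellite term vanishes for halo masses below the cut-off mass. All function and argument names, and the parameter values in the usage note, are hypothetical.

```python
import math


def mean_occupation_threshold(m_halo, log_m_min, sigma_log_m,
                              m_0, m_1_prime, alpha=1.0):
    """Mean number of galaxies in a halo of mass m_halo for a luminosity threshold sample."""
    # Softened step function for the central galaxy.
    n_cen = 0.5 * (1.0 + math.erf(
        (math.log10(m_halo) - log_m_min) / sigma_log_m))
    # Satellites per central; set to zero below the cut-off mass (assumption).
    if m_halo > m_0:
        n_sat_per_cen = ((m_halo - m_0) / m_1_prime) ** alpha
    else:
        n_sat_per_cen = 0.0
    return n_cen * (1.0 + n_sat_per_cen)
```

For a halo exactly at the transition mass, e.g. `mean_occupation_threshold(1e12, 12.0, 0.3, 1e11, 1e13)`, the central term is 0.5 and the satellite term adds 9 per cent, giving 0.545.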

The HOD model for a binned galaxy sample with $M_r^{t_2} < M_r < M_r^{t_1}$
is typically calculated from model fits to luminosity threshold
samples as

$$\langle N(M_h | M_r^{t_1}, M_r^{t_2}) \rangle = \langle N(M_h | M_r^{t_1}) \rangle - \langle N(M_h | M_r^{t_2}) \rangle . \tag{7}$$

While we note that the results of Zehavi et al. (2011) favor a
somewhat steeper slope of the satellite distribution for the most
luminous galaxy samples in our analysis, we set $\alpha_t = 1$ for all SDSS
galaxy samples. This is in overall agreement with previous results
for the luminosity range of interest, and makes differencing the
HOD of neighboring samples numerically more stable. Hence our
model has 4 free parameters for a luminosity threshold sample,
and 8 free parameters for a luminosity bin sample. Without
further constraints, such a parameterization of luminosity bin samples
has too many degrees of freedom for general applications.
However, it has the advantage that the HODs of neighboring luminosity
bins are consistent with each other, and we use this
parameterization to fit the different correlation functions among our SDSS faint
and bright samples, resulting in 8 parameters for the SDSS HODs
in each volume-limited sample.
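The differencing of two threshold HODs into a luminosity bin HOD can be illustrated as follows. This is a minimal sketch with the satellite slope fixed to one, as adopted in the text; the parameter tuples and their values are hypothetical and only serve to show that the bin occupation is the faint-threshold occupation minus the bright-threshold occupation.

```python
import math


def n_threshold(m_halo, params):
    """Threshold-sample occupation; params = (log_m_min, sigma_log_m, m_0, m_1_prime)."""
    log_m_min, sigma_log_m, m_0, m_1_prime = params
    n_cen = 0.5 * (1.0 + math.erf(
        (math.log10(m_halo) - log_m_min) / sigma_log_m))
    n_sat_per_cen = max(m_halo - m_0, 0.0) / m_1_prime  # slope alpha = 1
    return n_cen * (1.0 + n_sat_per_cen)


def n_bin(m_halo, faint_params, bright_params):
    """Occupation of the luminosity bin between the faint and bright thresholds."""
    return n_threshold(m_halo, faint_params) - n_threshold(m_halo, bright_params)
```

Because the two thresholds are fit with four parameters each, a bin model of this kind carries eight parameters, and physically sensible fits need the difference to stay non-negative at all halo masses.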

Furthermore, we assume the radial distribution of all color
independent galaxy samples to follow the dark matter distribution.
This assumption is supported by the results of Watson et al. (2012),
who studied the small scale clustering of SDSS galaxies. While
these authors found an enhanced clustering of luminous galaxies
on small scales ($r_p < 0.05$ Mpc/h) compared to an NFW
distribution, their galaxy correlation function measurements agree very
well with the predicted dark matter clustering over the radial scales
and luminosity range of interest for our analysis.

Figure 14. Linear galaxy bias measurements obtained from fits to the large-scale correlation function. Top: Linear bias as a function of luminosity for different
luminosity bin samples with bin width $\Delta M_r = 1.0$ (left), and threshold samples (right). The lines show best-fit relations from Zehavi et al. (2011).
Bottom: Linear bias as a function of (NUV - r) color and luminosity (left) or stellar mass (right), for galaxy samples with luminosity bin width $\Delta M_r = 0.5$.
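The large-scale bias fits summarized in Figure 14 amount to a one-parameter generalized least-squares problem. The sketch below is an illustration under simplifying assumptions: it treats an auto-correlation function, $w_{gal} = b^2 w_{mm}$, and uses the full covariance through a small Gaussian-elimination solver. The function names are hypothetical, and a cross-correlation with a reference sample would replace $b^2$ by the product of the two sample biases.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_linear_bias(w_gal, w_matter, cov):
    """Best-fit b for w_gal = b^2 * w_matter, given the (symmetric) data covariance."""
    cinv_m = solve(cov, w_matter)  # C^{-1} w_matter
    amplitude = (sum(d * c for d, c in zip(w_gal, cinv_m))
                 / sum(m * c for m, c in zip(w_matter, cinv_m)))
    return math.sqrt(amplitude)
```

The amplitude estimator minimizes $\chi^2$ analytically, so no iterative fit is needed for this single-parameter case.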



5.1.2 Luminosity and Color bin Samples 

For a (NUV - r) selected galaxy sample ($X$), which is measured
in one narrow 0.5 mag bin per sample volume, we need a more
compact description of the HOD, and we model the central galaxy
term as a clipped Gaussian,

$$\langle N_c(M_h, X) \rangle = \min\left(\frac{A_X}{\sqrt{2\pi}\,\sigma_X} \exp\left(-\frac{(\log M_h - \log M_c^X)^2}{2\sigma_X^2}\right),\, 1\right) , \tag{8}$$

with free parameters $A_X$, $\sigma_X$ and $M_c^X$. Here the clipping enforces
that a halo does not have more than one central galaxy.
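A clipped Gaussian of this kind is straightforward to evaluate. The following sketch assumes a normalized Gaussian in log halo mass with amplitude $A_X$, width $\sigma_X$ and central mass $M_c^X$, clipped at one; the function and argument names are hypothetical.

```python
import math


def n_central_color(m_halo, amplitude, log_m_c, sigma):
    """Mean number of central galaxies of a color-selected bin sample in a halo of mass m_halo."""
    gauss = (amplitude / (math.sqrt(2.0 * math.pi) * sigma)
             * math.exp(-(math.log10(m_halo) - log_m_c) ** 2
                        / (2.0 * sigma ** 2)))
    # A halo cannot host more than one central galaxy.
    return min(gauss, 1.0)
```

For small amplitudes the clipping is inactive and the occupation is a pure Gaussian in log mass; for large amplitudes the peak saturates at one central galaxy per halo.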

The auto correlation function of color selected galaxies by
definition is only sensitive to galaxy pairs of the same color. Hence
HOD models require assumptions on the relation between the
colors of central and satellite galaxies, and in particular need to
account for central galaxies which are not part of the sample (e.g.
Simon et al. 2009; Skibba and Sheth 2009). In contrast, modeling the
cross correlation between a color selected galaxy sample and the
full (color independent) galaxy population with the same
luminosity threshold does not require such assumptions. This allows us to
simply write the condition that a halo has to contain a central galaxy
in order to host satellite galaxies in terms of the central galaxy
occupation function of the full (color independent) luminosity threshold
sample with luminosity threshold $t_X$ equal to the minimum
luminosity of the luminosity bin under consideration,



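$$\langle N_s(M_h, X) \rangle = A_X\,\frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\log M_h - \log M_{\min}^{t_X}}{\sigma_{\log M}^{t_X}}\right)\right] \left(\frac{M_h}{M_s^X}\right)^{\alpha_X} , \tag{9}$$

which is characterized by two free parameters, $M_s^X$ and $\alpha_X$.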

Note that the correlation function of a binned sample is
independent of the normalization parameter $A_X$, which is determined by
the galaxy number density.

Motivated by observations finding red satellite galaxies to be
radially more concentrated than blue galaxies (e.g. von der Linden
et al. 2010; Guo et al. 2012), we introduce another free parameter
$f_X$ which describes the NFW concentration, $c_X$, of a color selected
galaxy sample relative to that of dark matter,

$$c_X(M_h) = f_X\, c(M_h) . \tag{10}$$



5.2 Relation to Correlation Functions



The halo model prediction for the real-space correlation function 
takes the form 



$$1 + \xi(r) = \left(1 + \xi^{1h}(r)\right) + \left(1 + \xi^{2h}(r)\right) , \tag{11}$$

where $(1 + \xi^{1h})$ is proportional to the number of galaxy pairs residing
in the same halo (one-halo term), and the two-halo term $(1 + \xi^{2h})$
is proportional to the number of galaxy pairs occupying different
halos. The model real-space correlation function is related to the
projected correlation function as



w(r p ) = 2 



(12) 
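The projection of a real-space correlation function along the line of sight can be checked numerically. The sketch below assumes the standard form $w(r_p) = 2\int_0^{\pi_{\max}} d\pi\, \xi(\sqrt{r_p^2 + \pi^2})$ with a finite integration limit, which may differ in detail from the expression used in this analysis. The power-law correlation function, its parameters, and all function names are hypothetical and serve only as a test case with a known analytic limit.

```python
import math


def projected_correlation(xi, r_p, pi_max, n_steps=20000):
    """Midpoint-rule estimate of 2 * integral_0^pi_max xi(sqrt(r_p^2 + pi^2)) dpi."""
    step = pi_max / n_steps
    total = 0.0
    for i in range(n_steps):
        pi_los = (i + 0.5) * step  # line-of-sight separation
        total += xi(math.hypot(r_p, pi_los))
    return 2.0 * step * total


def power_law_xi(r, r_0=5.0, gamma=1.8):
    """Toy real-space correlation function (r / r_0)^(-gamma)."""
    return (r / r_0) ** (-gamma)


def power_law_wp(r_p, r_0=5.0, gamma=1.8):
    """Analytic projection of the power law for pi_max -> infinity."""
    return (r_p * (r_0 / r_p) ** gamma * math.gamma(0.5)
            * math.gamma((gamma - 1.0) / 2.0) / math.gamma(gamma / 2.0))
```

For a sufficiently large `pi_max` the numerical projection approaches the analytic power-law result; a finite limit of a few tens of Mpc/h truncates the integral and lowers the amplitude on large scales.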



We will now describe the computation of these terms in detail. In
order to evaluate these expressions numerically, we define halos
to enclose a spherical overdensity of 200 times the mean
background density and assume that their density distribution follows
an NFW profile (Navarro, Frenk and White 1997) with the halo
mass-concentration relation of Bhattacharya, Habib and Heitmann
(2011); furthermore we use the fitting functions of Tinker et al.
(2008) and Tinker et al. (2010) for the halo mass function and halo
bias relation. Unless stated otherwise, we assume that the galaxy
distribution follows the halo density profile.
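The normalized radial distribution of galaxies in an NFW halo, and the effect of rescaling the concentration of a color selected sample by a factor $f_X$, can be sketched as follows. The profile is normalized to unit mass within the virial radius; the function names and the example concentration are hypothetical, and no specific mass-concentration relation is implemented here.

```python
import math


def nfw_mass_factor(x):
    """m(x) = ln(1 + x) - x / (1 + x), proportional to the NFW mass within x = r / r_s."""
    return math.log(1.0 + x) - x / (1.0 + x)


def normalized_nfw_density(r, r_vir, concentration):
    """NFW density normalized so that the mass within r_vir equals one."""
    r_s = r_vir / concentration
    x = r / r_s
    norm = 4.0 * math.pi * r_s ** 3 * nfw_mass_factor(concentration)
    return 1.0 / (norm * x * (1.0 + x) ** 2)


def enclosed_fraction(r, r_vir, halo_concentration, f_x=1.0):
    """Fraction of the galaxies within radius r for concentration c_X = f_X * c."""
    c_x = f_x * halo_concentration
    return nfw_mass_factor(c_x * r / r_vir) / nfw_mass_factor(c_x)
```

With `f_x > 1` a larger fraction of the satellites lies at small radii, which is the behavior the concentration parameter is meant to capture for red satellites.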



5.2.1 One-Halo Term 

We split the computation of the one-halo term into the clustering
of central and satellite galaxies, $\xi^{1h,c-s}$, and satellite-satellite clustering,
$\xi^{1h,s-s}$, within the same halo. The central-satellite term is given by

$$1 + \xi_{XY}^{1h,c-s}(r) = \frac{1}{\bar{n}_X \bar{n}_Y} \int_{M_{vir}(r)}^{\infty} dM_h \frac{dn}{dM_h} \Big( \langle N_c(M_h, X) N_s(M_h, Y) \rangle\, \rho_Y(r | M_h) + \langle N_c(M_h, Y) N_s(M_h, X) \rangle\, \rho_X(r | M_h) \Big) , \tag{13}$$

where $dn/dM_h$ denotes the halo mass function, with $\rho_X(r | M_h)$ the
normalized radial distribution of galaxy population $X$ within the
halo, and with

$$\bar{n}_X = \int dM_h \frac{dn}{dM_h} \langle N(M_h | X) \rangle . \tag{14}$$

The term $\langle N_c(M_h, X) N_s(M_h, Y) \rangle$ in equation (13) is equal to the
average number of galaxy pairs with a central galaxy from
sample $X$ and a satellite galaxy from sample $Y$ in a halo of mass
$M_h$. From the definition of satellite galaxy this term evaluates to
$\langle N_c(M_h, M_r^t) N_s(M_h, M_r^t) \rangle = \langle N_s | M_h, M_r^t \rangle$ for the auto correlation
of a luminosity threshold sample (Zheng et al. 2005). However,
when considering binned samples or cross-correlations between
different samples, the central galaxy of a halo hosting satellite
galaxies from the sample $Y$ need not be from sample $X$, and we
use $\langle N_c(M_h, X) N_s(M_h, Y) \rangle = \langle N_c | M_h, X \rangle \langle N_s | M_h, Y \rangle$ (Miyaji et al.
2011).

If samples $X$ and $Y$ are disjoint, the satellite-satellite term is given
by

$$1 + \xi_{XY}^{1h,s-s}(r) = \frac{1}{\bar{n}_X \bar{n}_Y} \int_{M_{vir}(r)}^{\infty} dM_h \frac{dn}{dM_h} \langle N_s(M_h, X) N_s(M_h, Y) \rangle\, (\rho_X * \rho_Y)(r | M_h) , \tag{15}$$

where $(\rho_X * \rho_Y)(r | M_h)$ denotes the convolution of radial galaxy
distributions $\rho_X$ and $\rho_Y$, and where the average number of satellite
pairs is given by $\langle N_s(M_h, X) N_s(M_h, Y) \rangle = \langle N_s | M_h, X \rangle \langle N_s | M_h, Y \rangle$.
To model auto correlation functions, the number of galaxy pairs is
modified to

$$1 + \xi_{XX}^{1h,s-s}(r) = \frac{2}{\bar{n}_X \bar{n}_X} \int_{M_{vir}(r)}^{\infty} dM_h \frac{dn}{dM_h} \frac{\langle N_s(M_h, X) (N_s(M_h, X) - 1) \rangle}{2}\, (\rho_X * \rho_X)(r | M_h) . \tag{16}$$

Assuming that satellite galaxies are Poisson distributed, the
number of pairs evaluates to $\langle N_s (N_s - 1) \rangle = \langle N_s \rangle^2$.
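The Poisson identity for the satellite pair count can be verified directly by summing $k(k-1)$ over the Poisson probabilities. The short sketch below does this for an arbitrary mean; the function name and the truncation of the sum at `k_max` are choices made for this illustration.

```python
import math


def poisson_pair_moment(mean_n_sat, k_max=200):
    """Sum of k * (k - 1) * P(k) for a Poisson distribution with the given mean."""
    total = 0.0
    prob = math.exp(-mean_n_sat)  # P(0)
    for k in range(k_max + 1):
        total += k * (k - 1) * prob
        prob *= mean_n_sat / (k + 1)  # recurrence P(k + 1) = P(k) * mean / (k + 1)
    return total
```

For a mean of 2.5 satellites the sum converges to 6.25, the square of the mean, as expected for the second factorial moment of a Poisson distribution.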



5.2.2 Two-Halo Term 

On scales above ∼ 5 Mpc/h, the clustering of galaxies follows the
large-scale clustering of dark matter halos, and it is modeled as a
function of the dark matter correlation function $\xi_{mm}$,

$$\xi_{XY}^{2h}(r) \approx b_X b_Y\, \xi_{mm}(r) . \tag{17}$$

Here $b_X$ denotes the bias parameter of galaxy sample $X$, which we
calculate as

$$b_X = \frac{1}{\bar{n}_X} \int_0^{\infty} dM_h \frac{dn}{dM_h}\, b_h(M_h)\, \langle N(M_h | X) \rangle , \tag{18}$$

where $b_h$ is the halo bias parameter.

On intermediate scales one needs to account for the distribution
of galaxies within different halos and halo exclusion, i.e., the
fact that two halos contributing to the two-halo term do not
overlap. Following the spherical halo exclusion model of Tinker et al.
(2005), we restrict the calculation of the two-halo term at
separation $r$ to halos with $R_{vir,1} + R_{vir,2} \le r$. The effect of the distribution
of galaxies within the different halos on the correlation function is
given by the convolution of their respective density profiles. As this
requires convolving many different halo profiles, we calculate the
two-halo term in Fourier space:

$$P_{XY}^{2h}(k, r) = P_m(k)\, \frac{1}{\bar{n}'_X \bar{n}'_Y(r)} \int_0^{M_{lim,1}(r)} dM_1 \frac{dn}{dM_1} \langle N | M_1, X \rangle\, b_h(M_1)\, \tilde{\rho}_X(k, M_1) \int_0^{M_{lim,2}(M_1, r)} dM_2 \frac{dn}{dM_2} \langle N | M_2, Y \rangle\, b_h(M_2)\, \tilde{\rho}_Y(k, M_2) , \tag{19}$$

where $M_{lim,1}$ is the maximum halo mass such that $R_{vir}(M_{lim,1}) =
r - R_{vir}(M_{\min})$ with $M_{\min}$ the minimum halo mass of the HOD, where
$M_{lim,2}$ is defined by $R_{vir}(M_{lim,2}) = r - R_{vir}(M_1)$, and where $\tilde{\rho}_X$
denotes the Fourier transform of the normalized galaxy distribution
$\rho_X$. $\bar{n}'_X \bar{n}'_Y(r)$ denotes the number density of galaxy pairs restricted
to non-overlapping halos at separation $r$,

$$\bar{n}'_X \bar{n}'_Y(r) = \int_0^{M_{lim,1}(r)} dM_1 \frac{dn}{dM_1} \langle N | M_1, X \rangle \int_0^{M_{lim,2}(M_1, r)} dM_2 \frac{dn}{dM_2} \langle N | M_2, Y \rangle . \tag{20}$$

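The two-halo correlation function is obtained from the power
spectrum by

$$\xi_{XY}^{\prime\,2h}(r) = \frac{1}{2\pi^2} \int_0^{\infty} dk\, k^2\, \frac{\sin(kr)}{kr}\, P_{XY}^{2h}(k, r) . \tag{21}$$

As $\xi_{XY}^{\prime\,2h}$ has been obtained from a (radius-) restricted sample of
galaxy pairs, it is converted to a probability for the whole sample
by

$$1 + \xi_{XY}^{2h}(r) = \frac{\bar{n}'_X \bar{n}'_Y(r)}{\bar{n}_X \bar{n}_Y} \left(1 + \xi_{XY}^{\prime\,2h}(r)\right) . \tag{22}$$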



5.3 Analysis 



As described in section 4.2, for each luminosity bin sample of
interest we measure the projected auto and cross-correlation functions
of the SDSS faint and bright galaxy samples

$$(\mathbf{w}_{ff}, \mathbf{w}_{fb}, \mathbf{w}_{bb}) = \mathbf{W}_S , \tag{23}$$

where we have introduced the correlation function data vector
$\mathbf{w} = (w(r_{p,1}), w(r_{p,2}), \ldots, w(r_{p,N_{bin}}))$, and the cross-correlation
between (NUV - r) color selected luminosity bin samples and the
two SDSS galaxy samples

$$(\mathbf{w}_{Xf}, \mathbf{w}_{Xb}) = \mathbf{W}_X , \tag{24}$$

with $X \in \{\mathrm{blue}, \mathrm{green}, \mathrm{red}\}$.

Ideally one would fit all these cross-correlation functions
simultaneously; however, this is not practicable: as the (NUV - r)
selected galaxy samples are restricted to the GALEX + SDSS overlap
area, obtaining a joint covariance for the SDSS reference samples
and the color selected sample, $\mathrm{Cov}(\mathbf{w}_{ff}, \mathbf{w}_{fb}, \mathbf{w}_{bb}, \mathbf{w}_{Xf}, \mathbf{w}_{Xb})$, would
require restricting the SDSS clustering analysis to the combined
SDSS + GALEX footprint, which would discard 75% of the SDSS
area.⁴

Instead, we first model the SDSS correlation functions and
galaxy number densities with the eight parameter HOD described in
section 5.1.1 and then fit the color bin sample HOD (section 5.1.2)
using the model for the SDSS samples obtained in the previous
step, using the full (non block diagonal) data covariances (Fig. 13)
in each step. This method assumes that the color sample - SDSS
sample cross-correlations $(\mathbf{w}_{Xf}, \mathbf{w}_{Xb})$ contain little information on
the HOD of the SDSS sample compared to the SDSS internal
correlation functions used in the first step of the fitting procedure. This
assumption is well motivated by statistical uncertainties, as the color
selected samples are over an order of magnitude smaller than the
SDSS reference samples. We propagate correlated uncertainties in
the HOD model parameters for the SDSS reference sample to the
HOD of the color bin sample by marginalizing over 15 randomly
chosen models for the SDSS HOD.

Specifically, we compute the $\chi^2$ as

$$\chi^2_Y = \left(\mathbf{W}_Y^{data} - \mathbf{W}_Y^{model}\right)^T \mathrm{Cov}^{-1}(\mathbf{W}_Y) \left(\mathbf{W}_Y^{data} - \mathbf{W}_Y^{model}\right) + \left(\mathbf{n}_Y^{data} - \mathbf{n}_Y^{model}\right)^T \mathrm{Cov}^{-1}(\mathbf{n}_Y) \left(\mathbf{n}_Y^{data} - \mathbf{n}_Y^{model}\right) , \tag{25}$$

where $Y \in \{S, X\}$, with galaxy number densities $\mathbf{n}_S = (\bar{n}_f, \bar{n}_b)$ or
$\mathbf{n}_X = \bar{n}_X$, and with the statistical error on the number densities
$\mathrm{Cov}(\mathbf{n}_Y)$ estimated from field to field variations. The HOD
parameter space is explored using a Markov Chain Monte Carlo method
with a multi-variate Gaussian proposal function and flat priors
$(\log_{10} M_{\min}, \log_{10} M_c^X, \log_{10} M_1', \log_{10} M_s^X) \in [11, 17]$, $\{\sigma_{\log M}, \sigma_X\} \in
[0.05, 1.0]$, $\{\alpha_X, f_X\} \in [0.5, 2.0]$, and $\log_{10} M_0 \in [8, 15]$. At each step
a new set of HOD parameters is always accepted if $\chi^2_{new} < \chi^2_{old}$, and
it is accepted with probability $\exp(-(\chi^2_{new} - \chi^2_{old})/2)$ if $\chi^2_{new} > \chi^2_{old}$.
The typical chain length is 20000, and we compare 10 chains of
length 20000 and one chain of length 100000 to test convergence.
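The acceptance rule and the flat priors described above correspond to a standard Metropolis sampler, which can be sketched as follows. This is an illustration only: it uses an uncorrelated Gaussian proposal with hypothetical step sizes instead of a multi-variate Gaussian with a tuned covariance, and the `chi2` callable stands in for the full model evaluation.

```python
import math
import random


def accept_step(chi2_new, chi2_old, uniform_draw):
    """Always accept improvements; otherwise accept with probability exp(-(delta chi2) / 2)."""
    if chi2_new < chi2_old:
        return True
    return uniform_draw < math.exp(-(chi2_new - chi2_old) / 2.0)


def run_chain(chi2, start, step_sizes, bounds, n_steps=20000, seed=1):
    """Metropolis chain with flat priors given as (low, high) bounds per parameter."""
    rng = random.Random(seed)
    current, chi2_current = list(start), chi2(start)
    chain = []
    for _ in range(n_steps):
        proposal = [rng.gauss(x, s) for x, s in zip(current, step_sizes)]
        inside = all(lo <= p <= hi for p, (lo, hi) in zip(proposal, bounds))
        if inside:  # proposals outside the flat prior are rejected
            chi2_proposal = chi2(proposal)
            if accept_step(chi2_proposal, chi2_current, rng.random()):
                current, chi2_current = proposal, chi2_proposal
        chain.append(list(current))
    return chain
```

Convergence can then be tested as in the text, by comparing the parameter distributions from several independent chains (different seeds) with those from one longer chain.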



5.4 Results 

Our best-fit HOD model parameters for the SDSS samples and their
marginalized 1σ errors are given in Tab. 4. Our results agree well
with the corresponding luminosity threshold samples in the
analysis of Zehavi et al. (2011), and we confirm the overall trend of
characteristic halo masses for hosting central and satellite
galaxies with luminosity threshold. For a detailed comparison note that
these two analyses use different fitting formulae for the halo mass
function, halo bias and halo mass - concentration relations.

4 Also note that even if one was willing to discard most of the SDSS data,
obtaining an invertible joint covariance for the five different correlation
functions, sampled with $N_{bin}$ radial bins, would require dividing the joint
footprint into more than $5 \times N_{bin}$ equal-area jackknife regions, $N_{sub}$.
Additionally, the correction factor required to obtain an unbiased estimate of
the inverse covariance scales as the ratio of the number of bins (data vector
variables) to the number of data sets (Hartlap, Simon and Schneider 2007),
resulting either in very large error bars ($N_{sub} \sim 5 N_{bin}$) or restricting the
analysis to very small scales ($N_{sub} \gg 5 N_{bin}$).

Based on these HOD models for the SDSS reference
samples, we now turn to the color selected galaxy samples. Figure 15
shows the measured cross-correlation functions between color
samples and the SDSS reference samples, the best-fit model correlation
functions, and the best-fit halo occupation distribution. For
comparison, we also show the properly weighted sum of the color sample
HOD models, and the best-fit HOD for the color independent
sample of SDSS galaxies in the same luminosity bin. While the
characteristic mass scales of these HODs are similar, such comparisons
are limited by the large degeneracies between fit parameters.⁵
Overall, these models provide acceptable fits to the measured correlation
functions, with an exception for the green and red galaxy samples
in luminosity bin [-20.5, -21.0]. These correlation functions have
an unusually flat shape, do not show the characteristic transition from
the one-halo to the two-halo term regime, and the typical host halo masses
inferred from the two-halo regime are significantly larger than those
inferred from the one-halo term only. As discussed in section 4.3,
the redshift of this luminosity bin is centered on the Sloan Great
Wall, which is contained almost completely in the angular mask
of the SDSS-GALEX cross-match. Hence the clustering
measurements in this luminosity bin may be affected by the overdense
environment, and as the Great Wall occupies a disproportionately large
fraction of the combined footprint, the jackknife error bars may
underestimate the sample variance. For comparison we show the
cross-correlation functions of (g - r) color identified red galaxies
in this luminosity bin computed over the full SDSS area and the
combined survey footprint in Fig. 16. The clustering of (NUV - r)
and (g - r) selected red galaxies in the joint survey geometry is
nearly indistinguishable, while the cross-correlation function of red
galaxies in this luminosity bin over the full SDSS area has the
expected shape. It can be fit with a color bin HOD model with reduced
$\chi^2 = 3.2$, suggesting that the poor fit in Fig. 15 is indeed caused by
the Great Wall structure and not a systematic effect in the
construction of the (NUV - r) selected galaxy sample.

Figure 17 and Tab. 5 show marginalized constraints on the
mean mass of halos hosting a central galaxy of given color and
luminosity (which is different from $M_c^X$, as it also depends on the
scatter $\sigma_X$), satellite fraction, and HOD derived galaxy bias for
color and luminosity bin samples based on the parameterization
described in section 5.1.2. We show these derived quantities instead
of the HOD parameters as they are less affected by degeneracies
between the fit parameters, which cause large marginalized errors
in the individual fit parameters.

Based on this simple parameterization, we find red central
galaxies to occupy more massive halos than the average central
galaxy from the same luminosity bin. Within the statistical
uncertainty due to the small size of our color selected galaxy samples,
there is no significant difference between the halo masses of blue
and green central galaxies. At fixed luminosity, the satellite
fraction and HOD derived galaxy bias increase with (NUV - r) color.
The former is consistent with the results of Zehavi et al. (2011),
who found the satellite fraction to vary smoothly with (g - r) color
at fixed luminosity. Their analysis used a one-parameter family of
models based on the HOD of the color independent luminosity
threshold sample with only the normalization of the satellite galaxy
occupation function as a free parameter. Note that given the
similarities in central galaxy halo masses, differences in the HOD
derived bias parameters mainly reflect the changes in the mean halo
mass for satellite galaxies. This implies that the host halo masses of
green satellite galaxies are intermediate between those of blue and
red satellite galaxies.

5 Ideally, one would fit all three color samples simultaneously and use the
sum of the three color sample HODs to fit the correlation functions of the
color independent luminosity bin sample. However, the survey area of our
current sample is not sufficient to estimate the large covariance matrices
required for such an analysis.

Figure 15. Each row shows the measured correlation functions and best-fit HOD of (NUV - r) selected galaxy samples for one luminosity bin. The left/middle
panels show the cross-correlation measurements using the faint/bright sample and their joint fit. We list the reduced $\chi^2$ of these fits in the middle panel. The
right panel shows the color sample HOD derived from fitting these cross-correlation functions, the sum of all the color samples, and the best-fit HOD of all
SDSS galaxies in the same luminosity bin.

Table 4. Best-fit HOD model parameters for SDSS samples

Figure 17. Derived HOD parameters for luminosity and color bin samples. Left: Mean mass of halos hosting a central galaxy from a particular sample.
Middle: Satellite fraction as a function of galaxy luminosity and color. Right: Galaxy bias derived from the HOD model fit.

Table 5. Best-fit HOD model derived parameters, and their correlation coefficients, $\rho$, for color selected galaxy samples

Figure 16. Cross-correlation functions of red galaxies in luminosity bin
[-20.5, -21.0] for different survey areas. The dashed line shows the
cross-correlation function with all (g - r) > 0.85 galaxies in SDSS in this
magnitude bin, the dotted line restricts the SDSS red galaxies to the
combined footprint, and the solid line shows the cross-correlation function for
(NUV - r) identified red galaxies.

Overall, we find the slope of the satellite occupation
distribution, $\alpha_X$, and radial concentration parameter, $c_X$, to increase
with (NUV - r) color. However, the degeneracies between HOD
parameters are large and do not allow us to put reliable constraints
on their luminosity dependence.

For luminosity bin [-20.5, -21.0] we also show results
derived from (g - r) selected red galaxies in the full SDSS area to
indicate the impact of the Sloan Great Wall. In the Great Wall the
satellite fraction and halo mass of red galaxies are increased
compared to a more representative survey volume, as expected from
the color-density relation. As the (NUV - r) color selected
samples in this luminosity bin are subject to increased sample variance,
the results for blue and green galaxies in this luminosity bin should
similarly be interpreted with caution.

As noted by Martin et al. (2007) and Salim et al. (2007), a
large fraction of active galactic nuclei (AGN) have green (NUV - r)
colors. These galaxies may be transitional galaxies with star
formation being quenched by AGN feedback (e.g., after undergoing
a major merger, Springel, Di Matteo and Hernquist 2005), or red
sequence interlopers which appear green due to the NUV AGN
continuum emission. We test whether the intermediate clustering
of green valley galaxies is caused by AGN, which may be a
different population than the non-AGN transitional galaxies. We identify
green AGN through emission line diagrams (Baldwin, Phillips and
Terlevich 1981) using the Kewley et al. (2001) extreme starburst
classification line. We use the emission line measurements from
the MPA-JHU catalog and require a signal-to-noise S/N > 3 in
the emission lines. Our goal is to remove any potential AGN
contamination from the green valley galaxy sample, and we remove all
galaxies which are classified as AGN in at least one of the three
diagrams, as this allows us to categorize galaxies which do not meet
the S/N threshold for all emission lines. Repeating our clustering
and HOD analysis for non-AGN green galaxies, we find the HOD
of green non-AGN galaxies to be indistinguishable from that of green
galaxies including AGN, in agreement with trends earlier observed
by Li et al. (2006) and Heinis et al. (2009). We do not show results
derived from HOD fits for the non-AGN green valley galaxies in
luminosity bin [-21.0, -21.5] as this sample is too small to obtain
stable covariances.
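The AGN selection described above can be sketched in Python as follows. The coefficients are the commonly quoted extreme starburst lines of Kewley et al. (2001) for the [NII], [SII] and [OI] diagrams and should be checked against that paper before use; the function names, the dictionary layout, and the convention that diagrams failing the S/N > 3 requirement are simply omitted are assumptions of this example.

```python
# (a, x0, c) in log([OIII]/Hb) = a / (log_x - x0) + c for each diagram,
# with log_x = log([NII]/Ha), log([SII]/Ha) or log([OI]/Ha).
KEWLEY_2001_LINES = {
    "nii": (0.61, 0.47, 1.19),
    "sii": (0.72, 0.32, 1.30),
    "oi": (0.73, -0.59, 1.33),
}


def above_starburst_line(diagram, log_x, log_oiii_hb):
    """True if the line ratios lie above the extreme starburst line of one diagram."""
    a, x0, c = KEWLEY_2001_LINES[diagram]
    if log_x >= x0:
        # Right of the vertical asymptote: outside the star-forming region.
        return True
    return log_oiii_hb > a / (log_x - x0) + c


def is_agn(line_ratios):
    """line_ratios maps diagram name -> (log_x, log([OIII]/Hb)) for diagrams with S/N > 3."""
    return any(above_starburst_line(name, log_x, log_y)
               for name, (log_x, log_y) in line_ratios.items())
```

A galaxy is flagged as soon as a single available diagram places it above the classification line, which mirrors the "at least one of the three diagrams" criterion and errs on the side of removing possible AGN.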



6 SUMMARY AND DISCUSSION 

We introduced a new analysis and HOD modeling technique for 
galaxy cross-correlation functions using multiple tracer popula- 
tions. This approach is particularly useful for interpreting the clus- 
tering of sparse and/or luminosity bin selected galaxy samples of 
interest. It is advantageous for the analysis of sparse galaxy samples 
as considering the cross-correlation function with more abundant 
galaxy populations significantly reduces the statistical uncertainty. 

While the galaxy number density provides strong constraints 
on the HOD of luminosity threshold samples, the HOD of lumi- 
nosity bin samples is independent of the galaxy abundance; in this 
case considering the cross-correlation with multiple tracer popula- 
tions is particularly useful as it provides an additional mass scale 
for the calibration of the luminosity bin HOD. An additional ad- 
vantage of this method is that modeling the CCF between a color 
selected sample and a color independent sample does not require 
assumptions on the correlation between central and satellite colors. 

This allows us to constrain the central galaxy HOD of color 
and luminosity bin selected samples for the first time. We apply 
this multiple tracer technique to analyze the clustering of (NUV-r) 
color selected blue, red, and green valley galaxy samples. Our key 
result is that halo mass of central galaxies, satellite fraction, and 
halo mass of satellite galaxies increase with (NUV - r) color at 
fixed luminosity. 

While our results indicate that the clustering properties of
green valley galaxies are consistent with them being an
intermediate population between blue and red galaxies, the (NUV - r)
selected green valley galaxy samples in this analysis consist of only
about one thousand galaxies and are too small to provide insight
on the transition mechanism(s) at work. In particular, the HOD
parameters which describe the abundance and distribution of color
selected satellite galaxies, i.e. the slope of the satellite occupation
function, $\alpha_X$, and the color dependence of the radial satellite
distribution, $c_X$, are poorly constrained by the data. With data from
future galaxy redshift surveys, these parameters will provide
information on the efficiency of star formation quenching as a function
of halo mass and location within a halo. Furthermore, larger data 
sets will enable a detailed measurement of the redshift space cor- 
relation function and thus enable constraints on the infall stage and 
satellite orbits of transitional galaxies. 

The reduced χ² values of the best-fit HODs in our analysis of 
color selected galaxy samples are relatively large, and our model 
fails in particular to reproduce the clustering of galaxies in 
or near the Sloan Great Wall. Overall, it is not surprising that a five 
parameter HOD model does not fully describe the color dependent 
clustering of galaxies. While the HOD formalism works well to de- 
scribe the overall relation between (color independent) galaxies and 
their halos, it is questionable if the strong assumptions implicit in 
the HOD formalism, such as the one-to-one relation between halo 
mass and bias, hold for each sub-population. Additionally, numerical 
and observational results indicate that the influence of massive 
halos may extend beyond r_200, e.g., through highly eccentric satellite 
orbits (Benson 2005; Wetzel, Tinker and Conroy 2011) and infall 
related shocks extending beyond the virial radius (e.g., Balogh, 
Navarro and Morris 2000), which is not easily incorporated in halo 
models. 

Finally, we note that Behroozi, Conroy and Wechsler (2010) and 
Leauthaud et al. (2011) recently proposed an improved HOD parameterization 
based on a detailed model for the relation between 
stellar mass and halo mass. Their results (figure 3 in Leauthaud 
et al. 2011) indicate that halo masses derived from the HOD 
parameterization for luminosity threshold samples adopted in our 
analysis (equation 6) may be biased by up to 40%, with the main 
source of this discrepancy being the assumptions of a power-law 
form and constant scatter for the luminosity-halo mass relation. For 
luminosity bin samples, however, these assumptions are better jus- 
tified, and we expect only small discrepancies between different 
HOD parameterizations. 